Add support for Coqui TTS #59

Cabeda · 2024-04-02T12:55:09Z

As the title says, this PR adds a new provider supporting Coqui TTS.

The default model, Tacotron2, works very similarly to EdgeTTS, although it only has a single voice option for now. The strength of this provider is that it can support multiple open TTS models, some of them very powerful, such as jenny.

Another interesting feature is voice dubbing with models such as XTTS V2. For now, though, there is a bug with sentences longer than 400 tokens. To support voice dubbing, I've added a folder with 3 voice samples and defaulted to the male one. Multiple languages are also supported in this mode. Because the options differ from the ones for --language, I've added a new option named --coqui_language.
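One possible workaround for the 400-token limit mentioned above is to split long text into smaller chunks before synthesis. The sketch below is only an illustration and is not taken from this PR: `split_for_tts` is a hypothetical helper, and it counts whitespace-separated words as a rough stand-in for the model's real tokens, so a lower `max_tokens` than 400 would probably be safer in practice.

```python
import re


def split_for_tts(text: str, max_tokens: int = 400) -> list[str]:
    """Split text into chunks of at most max_tokens words.

    Assumption: words approximate the model's tokens. The real tokenizer
    may produce more tokens than words, so leave some headroom.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        words = sentence.split()
        # A single sentence longer than the limit is cut into hard slices.
        while len(words) > max_tokens:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_tokens]))
            words = words[max_tokens:]
        # Start a new chunk when this sentence would overflow the current one.
        if current and len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk could then be synthesized separately and the resulting audio concatenated.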

For this version, the provider supports the same audio formats as EdgeTTS, thanks to pydub.
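For readers unfamiliar with pydub, here is a rough sketch of how a WAV file could be converted to an MP3 output format. It is not the code from this PR. The format name `audio-24khz-48kbitrate-mono-mp3` is assumed here as an example of an Edge-style format string, and `parse_format` and `convert_wav` are hypothetical helpers. pydub needs ffmpeg installed to write MP3.

```python
import re


def parse_format(fmt: str) -> dict:
    """Parse a name like 'audio-24khz-48kbitrate-mono-mp3' (assumed naming)."""
    match = re.fullmatch(r"audio-(\d+)khz-(\d+)kbitrate-mono-(mp3)", fmt)
    if match is None:
        raise ValueError(f"unsupported format: {fmt}")
    khz, kbps, codec = match.groups()
    return {"frame_rate": int(khz) * 1000, "bitrate": f"{kbps}k", "codec": codec}


def convert_wav(wav_path: str, out_path: str, fmt: str) -> None:
    """Convert a WAV file to the requested format using pydub."""
    # Imported lazily so parse_format works without pydub installed.
    from pydub import AudioSegment

    spec = parse_format(fmt)
    audio = AudioSegment.from_wav(wav_path)
    # Resample and downmix to mono to match the requested format.
    audio = audio.set_frame_rate(spec["frame_rate"]).set_channels(1)
    audio.export(out_path, format=spec["codec"], bitrate=spec["bitrate"])
```

Writing the intermediate WAV to a temporary folder first, as one of the commits in this PR does, keeps the conversion step independent of the TTS model.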

Note: Running Coqui TTS always requires downloading the AI model first. The download can range from a few MB to more than 1 GB.

Cabeda · 2024-04-04T09:20:58Z

@p0n1 do you have time to give your thoughts on this PR?

p0n1 · 2024-04-04T09:48:48Z

> @p0n1 do you have time to give your thoughts on this PR?

Hi @Cabeda, thank you for the great work. I just had surgery and am still recovering in the hospital. I will review the code when I feel better.

Cabeda · 2024-04-04T10:20:05Z

No probs! Hope for the best 💪🏼

kelvin-homann · 2024-04-12T15:37:02Z

Have you tried building the Docker image from the Dockerfile with this change? I checked out your repository, but apparently it's missing gcc and the Rust compiler. I think another image is needed to install TTS in the Docker image.

Bryksin

It would be nice if you added a link in the README file to a guide on how to set up Coqui, so I could set it up and test it before merging.

Bryksin · 2024-08-24T22:42:29Z

audiobook_generator/tts_providers/base_tts_provider.py



 def get_tts_provider(config) -> BaseTTSProvider:
    if config.tts == TTS_AZURE:
-        from audiobook_generator.tts_providers.azure_tts_provider import AzureTTSProvider
+        from audiobook_generator.tts_providers.azure_tts_provider import \


no functional change, just cosmetics, not needed

Bryksin · 2024-08-24T22:42:48Z

audiobook_generator/tts_providers/base_tts_provider.py

        return AzureTTSProvider(config)
    elif config.tts == TTS_OPENAI:
-        from audiobook_generator.tts_providers.openai_tts_provider import OpenAITTSProvider
+        from audiobook_generator.tts_providers.openai_tts_provider import \


no functional change, just cosmetics, not needed

Bryksin · 2024-08-24T22:42:55Z

audiobook_generator/tts_providers/base_tts_provider.py

        return OpenAITTSProvider(config)
    elif config.tts == TTS_EDGE:
-        from audiobook_generator.tts_providers.edge_tts_provider import EdgeTTSProvider
+        from audiobook_generator.tts_providers.edge_tts_provider import \


no functional change, just cosmetics, not needed

Bryksin · 2024-08-24T22:51:13Z

main.py

@@ -94,23 +94,23 @@ def handle_args():
        help='''
            Speaking rate of the text. Valid relative values range from -50%%(--xxx='-50%%') to +100%%. 
            For negative value use format --arg=value,
-        '''
+        ''',


It is the last argument in the function call, so the trailing comma is not required.
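For context on the help text quoted in this diff, a small standalone sketch (not code from `main.py`; the `demo` parser is made up for illustration) shows the two behaviors it relies on. argparse applies `%` formatting to help strings, so a literal percent sign is written as `%%`. A value such as `-50%` starts with a dash and is not a plain number, so argparse treats it as an option unless it is attached with `=`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--voice_rate",
        help='''
            Speaking rate, from -50%% to +100%%.
            For negative values use the format --voice_rate=value.
        ''',  # the trailing comma is optional; both forms behave the same
    )
    return parser


parser = build_parser()

# Attached with "=", the negative value is accepted.
args = parser.parse_args(["--voice_rate=-50%"])
print(args.voice_rate)  # -50%

# Passed as a separate argument, "-50%" looks like an unknown option,
# so argparse reports an error and exits.
try:
    parser.parse_args(["--voice_rate", "-50%"])
except SystemExit:
    print("rejected")
```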

Bryksin · 2024-08-24T22:51:18Z

main.py

    )

    edge_tts_group.add_argument(
        "--voice_volume",
        help='''
            Volume level of the speaking voice. Valid relative values floor to -100%%.
            For negative value use format --arg=value,
-        '''
+        ''',


It is the last argument in the function call, so the trailing comma is not required.

Bryksin · 2024-08-24T22:51:21Z

main.py

    )

    edge_tts_group.add_argument(
        "--voice_pitch",
        help='''
            Baseline pitch for the text.Valid relative values like -80Hz,+50Hz, pitch changes should be within 0.5 to 1.5 times the original audio.
            For negative value use format --arg=value,
-        '''
+        ''',


It is the last argument in the function call, so the trailing comma is not required.

Cabeda added 19 commits April 1, 2024 09:01

Auto commit

28dd92f

Auto commit

06da573

Auto commit

14e6762

Auto commit

384cc4e

Auto commit

0711cb0

Auto commit

d829ecf

Auto commit

4f8bedc

Auto commit

b098208

Remove samples

c7178c9

Remove lost sols example

5ab8749

Remove ignored files

854230e

support coqui multi lingual

4e88170

fix conflict

c4f1b74

Add readme examples

ad1edec

Revert format changes

180671e

revert format

e4ceee2

Add support for the same output formats of edge tts

a2fc0c1

Write the wav file to a tmp folder

c38951a

%fix rename to coqui_tts_provider

521bfec

Bryksin requested changes Aug 24, 2024
